82 research outputs found

Automatic discovery of cross-family sequence features associated with protein function

Author: Brameier Markus
Haan Josien
Krings Andrea
MacCallum Robert M
Publication venue: BioMed Central
Publication date: 01/01/2006

BACKGROUND: Methods for predicting protein function directly from amino acid sequences are useful tools in the study of uncharacterised protein families and in comparative genomics. Until now, this problem has been approached using machine learning techniques that attempt to predict membership, or otherwise, to predefined functional categories or subcellular locations. A potential drawback of this approach is that the human-designated functional classes may not accurately reflect the underlying biology, and consequently important sequence-to-function relationships may be missed. RESULTS: We show that a self-supervised data mining approach is able to find relationships between sequence features and functional annotations. No preconceived ideas about functional categories are required, and the training data is simply a set of protein sequences and their UniProt/Swiss-Prot annotations. The main technical aspect of the approach is the co-evolution of amino acid-based regular expressions and keyword-based logical expressions with genetic programming. Our experiments on a strictly non-redundant set of eukaryotic proteins reveal that the strongest and most easily detected sequence-to-function relationships are concerned with targeting to various cellular compartments, which is an area already well studied both experimentally and computationally. Of more interest are a number of broad functional roles which can also be correlated with sequence features. These include inhibition, biosynthesis, transcription and defence against bacteria. Despite substantial overlaps between these functions and their corresponding cellular compartments, we find clear differences in the sequence motifs used to predict some of these functions. For example, the presence of polyglutamine repeats appears to be linked more strongly to the "transcription" function than to the general "nuclear" function/location. 
CONCLUSION: We have developed a novel and useful approach for knowledge discovery in annotated sequence data. The technique is able to identify functionally important sequence features and does not require expert knowledge. By viewing protein function from a sequence perspective, the approach is also suitable for discovering unexpected links between biological processes, such as the recently discovered role of ubiquitination in transcription.

Springer - Publisher Connector

Directory of Open Access Journals

PubMed Central

A genome-wide survey for prion-regulated miRNAs associated with cholesterol homeostasis

Author: Ann-Christin Schmädicke
Dirk Motzkus
Hermann M Schätzl
Judith Montag
Markus Brameier
Sabine Gilch
Publication venue: 'Springer Science and Business Media LLC'
Publication date: 01/01/2012

Crossref

Springer - Publisher Connector

Genome-wide comparative analysis of microRNAs in three non-human primates

Author: A Nahvi
B Zhang
D Bartel
D Gusfield
E Berezikov
E Lai
I Hofacker
J Brown
J Hertel
J Nam
J Yue
L He
L Lim
M Brameier
M Legendre
M Saunders
M Weber
Markus Brameier
R Raaum
S Altschul
S Griffiths-Jones
U Ohler
V Ambros
V Baev
X Wang
Y Altuvia
Publication venue: BioMed Central
Publication date: 01/01/2010

Crossref

Springer - Publisher Connector

PubMed Central

Mitogenomic phylogeny of the common long-tailed macaque (Macaca fascicularis fascicularis)

Author: Abdul-Latiff Muhammad AB
Abdul-Patah Pazil
Ampeng Ahmad
Brameier Markus
Böker Kai O
Kolleck Jakob
Lakim Maklarin
Liedigk Rasmus
Md-Zain Badrul Munir
Meijaard Erik
Roos Christian
Tosi Anthony J
Zinner Dietmar
Publication venue: 'Springer Science and Business Media LLC'
Publication date: 29/11/2018

BACKGROUND: Long-tailed macaques (Macaca fascicularis) are an important model species in biomedical research and reliable knowledge about their evolutionary history is essential for biomedical inferences. Ten subspecies have been recognized, of which most are restricted to small islands of Southeast Asia. In contrast, the common long-tailed macaque (M. f. fascicularis) is distributed over large parts of the Southeast Asian mainland and the Sundaland region. To shed more light on the phylogeny of M. f. fascicularis, we sequenced complete mitochondrial (mtDNA) genomes of 40 individuals from all over the taxon’s range, either by classical PCR-amplification and Sanger sequencing or by DNA-capture and high-throughput sequencing. RESULTS: Both laboratory approaches yielded complete mtDNA genomes from M. f. fascicularis with high accuracy and/or coverage. According to our phylogenetic reconstructions, M. f. fascicularis initially diverged into two clades 1.70 million years ago (Ma), with one including haplotypes from mainland Southeast Asia, the Malay Peninsula and North Sumatra (Clade A) and the other, haplotypes from the islands of Bangka, Java, Borneo, Timor, and the Philippines (Clade B). The three geographical populations of Clade A appear as paraphyletic groups, while local populations of Clade B form monophyletic clades with the exception of a Philippine individual which is nested within the Borneo clade. Further, in Clade B the branching pattern among main clades/lineages remains largely unresolved, most likely due to their relatively rapid diversification 0.93-0.84 Ma.
CONCLUSIONS: Both laboratory methods have proven to be powerful to generate complete mtDNA genome data with similarly high accuracy, with the DNA-capture and high-throughput sequencing approach as the most promising and only practical option to obtain such data from highly degraded DNA, in time and with relatively low costs. The application of complete mtDNA genomes yields new insights into the evolutionary history of M. f. fascicularis by providing a more robust phylogeny and more reliable divergence age estimations than earlier studies.

The Australian National University

Ab initio identification of human microRNAs based on structure motifs

Author: A Rodriguez
A Sewer
C Xue
Carsten Wiuf
D Bartel
D Gusfield
E Berezikov
E Bonnet
E Lai
I Bentwich
I Hofacker
I Hofacker
J Han
J Krol
J Nam
L He
L Lim
L Lim
M Brameier
M Legendre
M Weber
Markus Brameier
P Jiang
P Saetrom
S Altschul
S Baskerville
S Griffiths-Jones
S Helvik
S Kwang Loong
S Ying
T Gingeras
U Ohler
V Ambros
W Ritchie
X Wang
Y Altuvia
Y Grad
Y Zeng
Publication venue: BioMed Central
Publication date: 01/01/2007

BACKGROUND: MicroRNAs (miRNAs) are short, non-coding RNA molecules that are directly involved in post-transcriptional regulation of gene expression. The mature miRNA sequence binds to more or less specific target sites on the mRNA. Both their small size and sequence specificity make the detection of completely new miRNAs a challenging task. This cannot be based on sequence information alone, but requires structure information about the miRNA precursor. Unlike comparative genomics approaches, ab initio approaches are able to discover species-specific miRNAs without known sequence homology. RESULTS: MiRPred is a novel method for ab initio prediction of miRNAs by genome scanning that only relies on (predicted) secondary structure to distinguish miRNA precursors from other similar-sized segments of the human genome. We apply a machine learning technique, called linear genetic programming, to develop special classifier programs which include multiple regular expressions (motifs) matched against the secondary structure sequence. Special attention is paid to scanning issues. The classifiers are trained on fixed-length sequences as these occur when shifting a window in regular steps over a genome region. Various statistical and empirical evidence is collected to validate the correctness of and increase confidence in the predicted structures. Among other things, we propose a new criterion to select miRNA candidates with a higher stability of folding that is based on the number of matching windows around their genome location. An ensemble of 16 motif-based classifiers achieves 99.9 percent specificity with sensitivity remaining on an acceptable high level when requiring all classifiers to agree on a positive decision. A low false positive rate is considered more important than a low false negative rate, when searching larger genome regions for unknown miRNAs. 117 new miRNAs have been predicted close to known miRNAs on human chromosome 19. All candidate structures match the free energy distribution of miRNA precursors which is significantly shifted towards lower free energies. We employed a human EST library and found that around 75 percent of the candidate sequences are likely to be transcribed, with around 35 percent located in introns.
CONCLUSION: Our motif finding method is at least competitive to state-of-the-art feature-based methods for ab initio miRNA discovery. In doing so, it requires less previous knowledge about miRNA precursor structures while programs and motifs allow a more straightforward interpretation and extraction of the acquired knowledge.

Crossref

Springer

Springer - Publisher Connector

Directory of Open Access Journals

PubMed Central

Human box C/D snoRNAs with miRNA like functions: expanding the range of regulatory RNAs

Author: Antal
Aravin
Aravin
Astrid Herwig
Bachellerie
Castle
Chang
Cheloufi
Chen
Chen
Cifuentes
Clery
Collingwood
Culver
Denli
Dieci
Elbashir
Elbashir
Eliceiri
Ender
Filipowicz
Gan
Griffiths-Jones
Han
Heo
Hirose
Jens Gruber
Kawasaki
Kawasaki
Kim
Kim
Kishore
Kiss
Kiss
Kolev
Lagos-Quintana
Larkin
Leary
Lee
Lestrade
Lutz Walter
Maniataki
Markus Brameier
Matera
Mattick
Meister
Nabavi
Obernosterer
Richard Reinhardt
Ruby
Saetrom
Saito
Saraiya
Scott
Seto
Shabalina
Soifer
Song
Taft
Taft
Taft
Tarasov
Triboulet
Werner
Winter
Publication venue: Oxford University Press
Publication date: 01/01/2011

Small nucleolar RNAs (snoRNAs) and microRNAs are two classes of non-protein-coding RNAs with distinct functions in RNA modification or post-transcriptional gene silencing. In this study, we introduce novel insights to RNA-induced gene activity adjustments in human cells by identifying numerous snoRNA-derived molecules with miRNA-like function, including H/ACA box snoRNAs and C/D box snoRNAs. In particular, we demonstrate that several C/D box snoRNAs give rise to gene regulatory RNAs, named sno-miRNAs here. Our data are complementing the increasing number of studies in the field of small RNAs with regulatory functions. In massively deep sequencing of small RNA fractions we identified high copy numbers of sub-sequences from >30 snoRNAs with lengths of ≥18 nt. RNA secondary structure prediction indicated for a majority of candidates a location in predicted stem regions. Experimental analysis revealed efficient gene silencing for 11 box C/D sno-miRNAs, indicating cytoplasmic processing and recruitment to the RNA silencing machinery. Assays in four different human cell lines indicated variations in both the snoRNA levels and their processing to active sno-miRNAs. In addition we show that box D elements are predominantly flanking at least one of the sno-miRNA strands, while the box C element locates within the sequence of the sno-miRNA guide strand.

Crossref

PubMed Central

MPG.PuRe

On species delimitation: Yet another lemur species or just genetic variation?

BACKGROUND: Although most taxonomists agree that species are independently evolving metapopulation lineages that should be delimited with several kinds of data, the taxonomic practice in Malagasy primates (Lemuriformes) looks quite different. Several recently described lemur species are based solely on evidence of genetic distance and diagnostic characters of mitochondrial DNA sequences sampled from a few individuals per location. Here we explore the validity of this procedure for species delimitation in lemurs using published sequence data. RESULTS: We show that genetic distance estimates and Population Aggregation Analysis (PAA) are inappropriate for species delimitation in this group of primates. Intra- and interspecific genetic distances overlapped in 14 of 17 cases independent of the genetic marker used. A simulation of a fictive taxonomic study indicated that for the mitochondrial D-loop the minimum required number of individuals sampled per location is 10 in order to avoid false positives via PAA.
CONCLUSIONS: Genetic distance estimates and PAA alone should not be used for species delimitation in lemurs. Instead, several nuclear and sex-specific loci should be considered and combined with other data sets from morphology, ecology or behavior. Independent of the data source, sampling should be done in a way to ensure a quantitative comparison of intra- and interspecific variation of the taxa in question. The results of our study also indicate that several of the recently described lemur species should be reevaluated with additional data and that the number of good species among the currently known taxa is probably lower than currently assumed.

Crossref

Springer - Publisher Connector

Directory of Open Access Journals

PubMed Central

Nuclear versus mitochondrial DNA: evidence for hybridization in colobine monkeys

Author: A Rambaut
A Rambaut
AE Lebatard
AE Pusey
AG Davies
AH Salem
AH Salem
AJ Drummond
AJ Drummond
AJ Drummond
AJ Tosi
AM Shedlock
B Hallet
BR Benefit
C Darwin
C Roos
CB Stewart
CB Stewart
Christian Roos
Christiane Schwarz
CP Groves
CP Groves
D Brandon-Jones
D Chakraborty
D Funk
D Posada
D Zinner
D Zinner
DA Pollard
DA Ray
DA Ray
Dietmar Zinner
Dirk Meyer
DJ Zwickl
DL Swofford
Dyah Perwitasari-Farajallah
E Delson
E Mayr
E Meijaard
E Strasser
F Ronquist
Fabian H Leendertz
FS Szalay
H Kishino
H Philippe
H Shimodaira
IS Zalmout
J Castresana
J Kelley
J Li
J Schmitz
J Schmitz
J Xing
J Xing
J Xing
JC Avise
Jinchuan Xing
JP Huelsenbeck
JR Napier
K Katoh
K McCracken
KG Miller
KN Sterner
KP Burnham
KP Karanth
L Cortés-Ortiz
L Hellborg
Laura S Kubatko
LN Van de Lagemaat
LS Kubatko
LS Kubatko
LS Whitfield
Lutz Walter
M Brunet
M Goodman
M Osterholz
M Osterholz
MA Batzer
Mark A Batzer
Markus Brameier
Martin Osterholz
MG Leakey
ML Arnold
Mouyu Yang
N Okada
N Patterson
N Ting
N Ting
NG Jablonski
NG Jablonski
NH Barton
NM Young
O Seehausen
O Thalmann
P Vignaud
PJ Whybrow
R Chaves
R Nichols
RE Green
RJ Petit
RL Raaum
S Koblmüller
S Merker
S Pääbo
SK Wyman
Stephen D Nash
SW Herke
Thomas Ziegler
Tilo Nadler
VN Thinh
WP Maddison
Y Rumpler
YP Zhang
YZ Peng
Z Yang
Publication venue: BioMed Central
Publication date: 01/01/2011

BACKGROUND: Colobine monkeys constitute a diverse group of primates with major radiations in Africa and Asia. However, phylogenetic relationships among genera are under debate, and recent molecular studies with incomplete taxon-sampling revealed discordant gene trees. To solve the evolutionary history of colobine genera and to determine causes for possible gene tree incongruences, we combined presence/absence analysis of mobile elements with autosomal, X chromosomal, Y chromosomal and mitochondrial sequence data from all recognized colobine genera. RESULTS: Gene tree topologies and divergence age estimates derived from different markers were similar, but differed in placing Piliocolobus/Procolobus and langur genera among colobines. Although insufficient data, homoplasy and incomplete lineage sorting might all have contributed to the discordance among gene trees, hybridization is favored as the main cause of the observed discordance. We propose that African colobines are paraphyletic, but might later have experienced female introgression from Piliocolobus/Procolobus into Colobus. In the late Miocene, colobines invaded Eurasia and diversified into several lineages. Among Asian colobines, Semnopithecus diverged first, indicating langur paraphyly. However, unidirectional gene flow from Semnopithecus into Trachypithecus via male introgression followed by nuclear swamping might have occurred until the earliest Pleistocene.
CONCLUSIONS: Overall, our study provides the most comprehensive view on colobine evolution to date and emphasizes that analyses of various molecular markers, such as mobile elements and sequence data from multiple loci, are crucial to better understand evolutionary relationships and to trace hybridization events. Our results also suggest that sex-specific dispersal patterns, promoted by a respective social organization of the species involved, can result in different hybridization scenarios.

Crossref

Springer - Publisher Connector

Directory of Open Access Journals

PubMed Central

Louisiana State University

Publikationsserver des Robert Koch-Instituts

A Comparison of Genetic Programming and Neural Networks in Medical Data Analysis

Author: Markus Brameier
Wolfgang Banzhaf
Publication date: 01/01/1998

We apply an interpreting variant of linear genetic programming to several diagnosis problems in medicine. We compare our results to results obtained with neural networks and argue that genetic programming is able to show similar performances in classification and generalization even when using a relatively small number of generations. Finally, an efficient algorithm for the elimination of introns in linear genetic programs is presented.

CiteSeerX

Eldorado - Ressourcen aus und für Lehre, Studium und Forschung